Edelta: A Word-Enlarging Based Fast Delta Compression Approach
Abstract
Delta compression, a promising data reduction approach capable of finding the small differences (i.e., delta) among very similar files and chunks, is widely used for optimizing replicate synchronization, backup/archival storage, cache compression, etc. However, delta compression is costly because of its time-consuming word-matching operations for delta calculation. Our in-depth examination suggests that there exists strong word-content locality for delta compression, which means that contiguous duplicate words appear in approximately the same order in their similar versions. This observation motivates us to propose Edelta, a fast delta compression approach based on a word-enlarging process that exploits word-content locality. Specifically, Edelta first tentatively finds a matched (duplicate) word, and then greedily stretches the matched word boundary to find a likely much longer (enlarged) duplicate word. Hence, Edelta effectively reduces a potentially large number of the traditional time-consuming word-matching operations to a single word-enlarging operation, which significantly accelerates the delta compression process. Our evaluation based on two case studies shows that Edelta achieves an encoding speedup of 3X to 10X over the state-of-the-art Ddelta, Xdelta, and Zdelta approaches without noticeably sacrificing the compression ratio.
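The word-enlarging idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the fixed word length `WORD`, the sliding-window index over the base, the copy/insert instruction format, and the function names `build_index`, `encode`, and `decode` are all assumptions made for illustration. The sketch looks up one short word of the target in the base and, on a hit, greedily extends the match byte by byte, so that a long duplicate region costs one lookup instead of many.

```python
from typing import Dict, List, Optional, Tuple, Union

WORD = 8  # hypothetical word length in bytes; not taken from the paper

Copy = Tuple[str, int, int]   # ("copy", base_offset, length)
Insert = Tuple[str, bytes]    # ("insert", literal_bytes)
Op = Union[Copy, Insert]


def build_index(base: bytes) -> Dict[bytes, int]:
    """Map every WORD-byte substring of base to its first offset (assumed scheme)."""
    index: Dict[bytes, int] = {}
    for off in range(len(base) - WORD + 1):
        index.setdefault(base[off:off + WORD], off)
    return index


def encode(base: bytes, target: bytes) -> List[Op]:
    """Encode target as copy/insert instructions against base."""
    index = build_index(base)
    ops: List[Op] = []
    literal = bytearray()  # unmatched bytes waiting to be emitted
    pos = 0
    while pos < len(target):
        off: Optional[int] = None
        if pos + WORD <= len(target):
            # Tentative match: a single word lookup.
            off = index.get(target[pos:pos + WORD])
        if off is None:
            literal.append(target[pos])
            pos += 1
            continue
        # Word enlarging: stretch the match boundary while the bytes agree.
        length = WORD
        while (off + length < len(base)
               and pos + length < len(target)
               and base[off + length] == target[pos + length]):
            length += 1
        if literal:
            ops.append(("insert", bytes(literal)))
            literal.clear()
        ops.append(("copy", off, length))
        pos += length  # skip the whole enlarged word, no further lookups
    if literal:
        ops.append(("insert", bytes(literal)))
    return ops


def decode(base: bytes, ops: List[Op]) -> bytes:
    """Rebuild the target from base and the instruction list."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += base[off:off + length]
        else:
            out += op[1]
    return bytes(out)
```

For two versions of a sentence that differ in a single word, `encode` in this sketch produces one enlarged copy before the change, a short insert for the changed bytes, and one enlarged copy after it; `decode(base, encode(base, target))` returns `target`. The sketch only extends matches forward and makes no attempt to reproduce the chunking, hashing, or instruction encoding used by Edelta itself.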
Similar resources
Developmental stage-specific regulation of TCR-alpha-chain gene assembly by intrinsic features of the TEA promoter.
The TCR delta- and alpha-chain genes lie in a single complex locus, the TCRalpha/delta locus. TCRdelta-chain genes are assembled in CD4(-)CD8(-) (double negative (DN)) thymocytes and TCRalpha-chain genes are assembled in CD4(+)CD8(+) (double positive) thymocytes due, in part, to the developmental stage-specific activities of the TCRdelta and TCRalpha enhancers (Edelta and Ealpha), respectively....
Ddelta: A deduplication-inspired fast delta compression approach
Delta compression is an efficient data reduction approach to removing redundancy among similar data chunks and files in storage systems. One of the main challenges facing delta compression is its low encoding speed, a worsening problem in face of the steadily increasing storage and network bandwidth and speed. In this paper, we present Ddelta, a deduplication-inspired fast delta compression sch...
Flanking nuclear matrix attachment regions synergize with the T cell receptor delta enhancer to promote V(D)J recombination.
Previous studies have identified nuclear matrix attachment regions (MARs) that are closely associated with transcriptional enhancers in the IgH, Igkappa, and T cell receptor (TCR) beta loci, but have yielded conflicting information regarding their functional significance. In this report, a combination of in vitro and in situ mapping approaches was used to localize three MARs associated with the...
Hybrid Coding for Animated Polygonal Meshes
A new hybrid coding method for compressing animated polygonal meshes is presented. This paper assumes the simplistic representation of the geometric data: a temporal sequence of polygonal meshes for each discrete frame of the animated sequence. The method utilizes a delta coding and an octree-based method. In this hybrid method, both the octree approach and the delta coding approach are applied...
Delta-K²-tree for Compact Representation of Web Graphs
The World Wide Web structure can be represented by a directed graph named as the web graph. The web graphs have been used in a wide range of applications. However, the increasingly large-scale web graphs pose great challenges to the traditional memory-resident graph algorithms. In the literature, K-tree can efficiently compress the web graphs while supporting fast querying in the compressed dat...